Overview

Dataset statistics

Number of variables: 11
Number of observations: 207
Missing cells: 248
Missing cells (%): 10.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 30.0 KiB
Average record size in memory: 148.5 B
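
As a quick sanity check on the table above, the missing-cell percentage can be recomputed from the other figures. This is a minimal sketch that assumes the percentage is defined as missing cells divided by (variables × observations), which is consistent with the reported value; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 11
n_observations = 207
missing_cells = 248

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)
print(f"{missing_pct:.1f}%")
```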

Variable types

NUM: 10
CAT: 1

Reproduction

Analysis started: 2020-04-22 20:53:25.804877
Analysis finished: 2020-04-22 20:53:40.551633
Version: pandas-profiling v2.6.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Exporters has a high cardinality: 207 distinct values [High cardinality]
Share in world exports (%) is highly correlated with Value exported in 2019 (USD thousand) [High correlation]
Value exported in 2019 (USD thousand) is highly correlated with Share in world exports (%) [High correlation]
Annual growth in value between 2015-2019 (%) has 4 (1.9%) missing values [Missing]
Annual growth in value between 2018-2019 (%) has 14 (6.8%) missing values [Missing]
CO2 emission (tons) has 178 (86.0%) missing values [Missing]
Average distance of importing countries (km) has 13 (6.3%) missing values [Missing]
Concentration of importing countries has 4 (1.9%) missing values [Missing]
Ease of doing business ranking has 35 (16.9%) missing values [Missing]
Annual growth in value between 2015-2019 (%) has 6 (2.9%) zeros [Zeros]
Annual growth in value between 2018-2019 (%) has 3 (1.4%) zeros [Zeros]
Share in world exports (%) has 155 (74.9%) zeros [Zeros]

Variables

Exporters
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE
Distinct count: 207
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Common values

Value | Count | Frequency (%)
Uganda | 1 | 0.5%
Taipei, Chinese | 1 | 0.5%
Saudi Arabia | 1 | 0.5%
Peru | 1 | 0.5%
Montenegro | 1 | 0.5%
Somalia | 1 | 0.5%
Slovakia | 1 | 0.5%
Lebanon | 1 | 0.5%
Romania | 1 | 0.5%
Malawi | 1 | 0.5%
Other values (197) | 197 | 95.2%

Length

Max length: 38
Mean length: 10.49275362
Min length: 4

Unicode categories

Value | Count | Frequency (%)
Lowercase_Letter | 28 | 48.3%
Uppercase_Letter | 24 | 41.4%
Other_Punctuation | 3 | 5.2%
Space_Separator | 1 | 1.7%
Open_Punctuation | 1 | 1.7%
Close_Punctuation | 1 | 1.7%

Unicode scripts

Value | Count | Frequency (%)
Latin | 52 | 89.7%
Common | 6 | 10.3%

Unicode blocks

Value | Count | Frequency (%)
ASCII | 56 | 100.0%

Value exported in 2019 (USD thousand)
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 188
Unique (%): 90.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3066513.807
Minimum: 1
Maximum: 89227214
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 79
Median: 5184
Q3: 313214
95-th percentile: 15202485.5
Maximum: 89227214
Range: 89227213
Interquartile range (IQR): 313135

Descriptive statistics

Standard deviation: 11744290.91
Coefficient of variation (CV): 3.829850981
Kurtosis: 29.06917631
Mean: 3066513.807
Mean absolute deviation (MAD): 5138491.311
Skewness: 5.179453316
Sum: 634768358
Variance: 1.37928369e+14
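
Several of the statistics reported above for Value exported in 2019 (USD thousand) are derived from the others. The following sketch recomputes them from the reported figures, assuming the usual definitions (range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative only.

```python
import math

# Figures copied from the report for "Value exported in 2019 (USD thousand)".
n_observations = 207
minimum, maximum = 1, 89227214
q1, q3 = 79, 313214
total = 634768358
mean = 3066513.807
std = 11744290.91
variance = 1.37928369e+14

value_range = maximum - minimum      # reported Range
iqr = q3 - q1                        # reported Interquartile range (IQR)
cv = std / mean                      # reported Coefficient of variation (CV)
mean_from_sum = total / n_observations

print(value_range, iqr)
print(round(cv, 6))
print(math.isclose(mean_from_sum, mean, rel_tol=1e-9))
print(math.isclose(math.sqrt(variance), std, rel_tol=1e-6))
```

A coefficient of variation close to 4, together with a median far below the mean, reflects how strongly a few large exporters dominate this variable.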
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.00000000e+00 1.50000000e+00 1.25000000e+01 7.85000000e+01 2.08000000e+02 ... 6.13550000e+04 4.50162500e+05 1.41218700e+06 1.22553455e+07 8.92272140e+07], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
1 | 8 | 3.9%
3 | 3 | 1.4%
5 | 3 | 1.4%
35 | 3 | 1.4%
21 | 2 | 1.0%
12 | 2 | 1.0%
79 | 2 | 1.0%
23 | 2 | 1.0%
9 | 2 | 1.0%
50 | 2 | 1.0%
Other values (178) | 178 | 86.0%

Minimum 5 values

Value | Count | Frequency (%)
1 | 8 | 3.9%
2 | 1 | 0.5%
3 | 3 | 1.4%
4 | 1 | 0.5%
5 | 3 | 1.4%

Maximum 5 values

Value | Count | Frequency (%)
89227214 | 1 | 0.5%
83047883 | 1 | 0.5%
53608528 | 1 | 0.5%
53561670 | 1 | 0.5%
53555313 | 1 | 0.5%

Trade balance in 2019 (USD thousand)
Real number (ℝ)

Distinct count: 207
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -134720.4783
Minimum: -74676311
Maximum: 51485644
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: -74676311
5-th percentile: -3118929.3
Q1: -492968.5
Median: -142927
Q3: -15378
95-th percentile: 4579306
Maximum: 51485644
Range: 126161955
Interquartile range (IQR): 477590.5

Descriptive statistics

Standard deviation: 8150828.069
Coefficient of variation (CV): -60.50177504
Kurtosis: 46.89847465
Mean: -134720.4783
Mean absolute deviation (MAD): 2165206.172
Skewness: -1.474549126
Sum: -27887139
Variance: 6.643599822e+13
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-7.46763110e+07 -5.75995950e+06 -2.35602850e+06 -5.42083000e+05 -2.30434500e+05 ... -8.24250000e+03 1.98000000e+02 1.11696500e+05 1.47591075e+07 5.14856440e+07], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
-1071611 | 1 | 0.5%
-114350 | 1 | 0.5%
-13507 | 1 | 0.5%
-10944 | 1 | 0.5%
-793229 | 1 | 0.5%
-67773 | 1 | 0.5%
-94904 | 1 | 0.5%
-1036983 | 1 | 0.5%
-1728694 | 1 | 0.5%
-150373 | 1 | 0.5%
Other values (197) | 197 | 95.2%

Minimum 5 values

Value | Count | Frequency (%)
-74676311 | 1 | 0.5%
-24448675 | 1 | 0.5%
-20856636 | 1 | 0.5%
-13225187 | 1 | 0.5%
-6114326 | 1 | 0.5%

Maximum 5 values

Value | Count | Frequency (%)
51485644 | 1 | 0.5%
45340758 | 1 | 0.5%
32519888 | 1 | 0.5%
15700287 | 1 | 0.5%
13817928 | 1 | 0.5%

Annual growth in value between 2015-2019 (%)
Real number (ℝ)

Distinct count: 93
Unique (%): 45.8%
Missing: 4
Missing (%): 1.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.96551724
Minimum: -88
Maximum: 1079
Zeros: 6
Zeros (%): 2.9%
Memory size: 1.7 KiB

Quantile statistics

Minimum: -88
5-th percentile: -44.8
Q1: -8.5
Median: 5
Q3: 19
95-th percentile: 158
Maximum: 1079
Range: 1167
Interquartile range (IQR): 27.5

Descriptive statistics

Standard deviation: 123.86584
Coefficient of variation (CV): 4.133612614
Kurtosis: 33.60196207
Mean: 29.96551724
Mean absolute deviation (MAD): 55.35179208
Skewness: 5.307713804
Sum: 6083
Variance: 15342.74633
Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
8 | 11 | 5.3%
-6 | 7 | 3.4%
0 | 6 | 2.9%
9 | 6 | 2.9%
19 | 5 | 2.4%
7 | 5 | 2.4%
-11 | 5 | 2.4%
5 | 5 | 2.4%
2 | 5 | 2.4%
12 | 5 | 2.4%
Other values (83) | 143 | 69.1%

Minimum 5 values

Value | Count | Frequency (%)
-88 | 1 | 0.5%
-74 | 1 | 0.5%
-61 | 1 | 0.5%
-59 | 1 | 0.5%
-55 | 1 | 0.5%

Maximum 5 values

Value | Count | Frequency (%)
1079 | 1 | 0.5%
644 | 1 | 0.5%
621 | 1 | 0.5%
595 | 1 | 0.5%
545 | 1 | 0.5%

Annual growth in value between 2018-2019 (%)
Real number (ℝ)

Distinct count: 117
Unique (%): 60.6%
Missing: 14
Missing (%): 6.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 180.8290155
Minimum: -100
Maximum: 15053
Zeros: 3
Zeros (%): 1.4%
Memory size: 1.7 KiB

Quantile statistics

Minimum: -100
5-th percentile: -68
Q1: -14
Median: 6
Q3: 27
95-th percentile: 396.8
Maximum: 15053
Range: 15153
Interquartile range (IQR): 41

Descriptive statistics

Standard deviation: 1217.535659
Coefficient of variation (CV): 6.733076851
Kurtosis: 122.0152381
Mean: 180.8290155
Mean absolute deviation (MAD): 317.7902762
Skewness: 10.57315135
Sum: 34900
Variance: 1482393.08
Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
1 | 7 | 3.4%
3 | 7 | 3.4%
13 | 5 | 2.4%
11 | 5 | 2.4%
12 | 5 | 2.4%
15 | 4 | 1.9%
8 | 4 | 1.9%
-15 | 4 | 1.9%
6 | 4 | 1.9%
7 | 4 | 1.9%
Other values (107) | 144 | 69.6%
(Missing) | 14 | 6.8%

Minimum 5 values

Value | Count | Frequency (%)
-100 | 2 | 1.0%
-95 | 1 | 0.5%
-94 | 1 | 0.5%
-88 | 1 | 0.5%
-87 | 1 | 0.5%

Maximum 5 values

Value | Count | Frequency (%)
15053 | 1 | 0.5%
7020 | 1 | 0.5%
2422 | 1 | 0.5%
1557 | 1 | 0.5%
1528 | 1 | 0.5%

Share in world exports (%)
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 22
Unique (%): 10.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4801932367
Minimum: 0
Maximum: 14.1
Zeros: 155
Zeros (%): 74.9%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0.05
95-th percentile: 2.38
Maximum: 14.1
Range: 14.1
Interquartile range (IQR): 0.05

Descriptive statistics

Standard deviation: 1.849860651
Coefficient of variation (CV): 3.852325501
Kurtosis: 29.25524139
Mean: 0.4801932367
Mean absolute deviation (MAD): 0.8086489766
Skewness: 5.191796232
Sum: 99.4
Variance: 3.421984428
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.05 0.25 1.95 14.1 ], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
0 | 155 | 74.9%
0.1 | 19 | 9.2%
0.2 | 6 | 2.9%
0.6 | 3 | 1.4%
8.4 | 3 | 1.4%
1 | 2 | 1.0%
0.5 | 2 | 1.0%
0.3 | 2 | 1.0%
1.3 | 2 | 1.0%
2.8 | 1 | 0.5%
Other values (12) | 12 | 5.8%

Minimum 5 values

Value | Count | Frequency (%)
0 | 155 | 74.9%
0.1 | 19 | 9.2%
0.2 | 6 | 2.9%
0.3 | 2 | 1.0%
0.5 | 2 | 1.0%

Maximum 5 values

Value | Count | Frequency (%)
14.1 | 1 | 0.5%
13.1 | 1 | 0.5%
8.4 | 3 | 1.4%
7.6 | 1 | 0.5%
5.6 | 1 | 0.5%

CO2 emission (tons)
Real number (ℝ≥0)

MISSING
Distinct count: 29
Unique (%): 100.0%
Missing: 178
Missing (%): 86.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56483.62069
Minimum: 34
Maximum: 529696
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 34
5-th percentile: 359.6
Q1: 2022
Median: 10517
Q3: 53725
95-th percentile: 233525.2
Maximum: 529696
Range: 529662
Interquartile range (IQR): 51703

Descriptive statistics

Standard deviation: 110248.3207
Coefficient of variation (CV): 1.951863555
Kurtosis: 12.54537901
Mean: 56483.62069
Mean absolute deviation (MAD): 64598.25208
Skewness: 3.340583628
Sum: 1638025
Variance: 1.215469222e+10
Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
10517 | 1 | 0.5%
3512 | 1 | 0.5%
3456 | 1 | 0.5%
572 | 1 | 0.5%
681 | 1 | 0.5%
34 | 1 | 0.5%
1226 | 1 | 0.5%
218 | 1 | 0.5%
53725 | 1 | 0.5%
158980 | 1 | 0.5%
Other values (19) | 19 | 9.2%
(Missing) | 178 | 86.0%

Minimum 5 values

Value | Count | Frequency (%)
34 | 1 | 0.5%
218 | 1 | 0.5%
572 | 1 | 0.5%
681 | 1 | 0.5%
749 | 1 | 0.5%

Maximum 5 values

Value | Count | Frequency (%)
529696 | 1 | 0.5%
283222 | 1 | 0.5%
158980 | 1 | 0.5%
127130 | 1 | 0.5%
102741 | 1 | 0.5%

Average distance of importing countries (km)
Real number (ℝ≥0)

Distinct count: 192
Unique (%): 99.0%
Missing: 13
Missing (%): 6.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 4453.623711
Minimum: 304
Maximum: 16537
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 304
5-th percentile: 784.45
Q1: 1758
Median: 3356.5
Q3: 5769.5
95-th percentile: 12332.1
Maximum: 16537
Range: 16233
Interquartile range (IQR): 4011.5

Descriptive statistics

Standard deviation: 3659.802173
Coefficient of variation (CV): 0.8217582827
Kurtosis: 1.660468794
Mean: 4453.623711
Mean absolute deviation (MAD): 2762.511213
Skewness: 1.43210932
Sum: 864003
Variance: 13394151.95
Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
4284 | 2 | 1.0%
3924 | 2 | 1.0%
3834 | 1 | 0.5%
2689 | 1 | 0.5%
2938 | 1 | 0.5%
2590 | 1 | 0.5%
2631 | 1 | 0.5%
1273 | 1 | 0.5%
5059 | 1 | 0.5%
3838 | 1 | 0.5%
Other values (182) | 182 | 87.9%
(Missing) | 13 | 6.3%

Minimum 5 values

Value | Count | Frequency (%)
304 | 1 | 0.5%
347 | 1 | 0.5%
361 | 1 | 0.5%
485 | 1 | 0.5%
496 | 1 | 0.5%

Maximum 5 values

Value | Count | Frequency (%)
16537 | 1 | 0.5%
16303 | 1 | 0.5%
15645 | 1 | 0.5%
15643 | 1 | 0.5%
15087 | 1 | 0.5%

Concentration of importing countries
Real number (ℝ≥0)

MISSING
Distinct count: 77
Unique (%): 37.9%
Missing: 4
Missing (%): 1.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4190640394
Minimum: 0.03
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 0.03
5-th percentile: 0.071
Q1: 0.145
Median: 0.34
Q3: 0.615
95-th percentile: 1
Maximum: 1
Range: 0.97
Interquartile range (IQR): 0.47

Descriptive statistics

Standard deviation: 0.302263099
Coefficient of variation (CV): 0.7212814047
Kurtosis: -0.749260843
Mean: 0.4190640394
Mean absolute deviation (MAD): 0.2528219564
Skewness: 0.6741197598
Sum: 85.07
Variance: 0.09136298103
Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
1 | 19 | 9.2%
0.09 | 13 | 6.3%
0.13 | 6 | 2.9%
0.14 | 6 | 2.9%
0.41 | 5 | 2.4%
0.08 | 5 | 2.4%
0.06 | 5 | 2.4%
0.12 | 5 | 2.4%
0.34 | 5 | 2.4%
0.46 | 5 | 2.4%
Other values (67) | 129 | 62.3%

Minimum 5 values

Value | Count | Frequency (%)
0.03 | 1 | 0.5%
0.05 | 1 | 0.5%
0.06 | 5 | 2.4%
0.07 | 4 | 1.9%
0.08 | 5 | 2.4%

Maximum 5 values

Value | Count | Frequency (%)
1 | 19 | 9.2%
0.98 | 2 | 1.0%
0.95 | 2 | 1.0%
0.94 | 2 | 1.0%
0.93 | 1 | 0.5%

Ease of doing business ranking
Real number (ℝ≥0)

MISSING
Distinct count: 171
Unique (%): 99.4%
Missing: 35
Missing (%): 16.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 91.23837209
Minimum: 1
Maximum: 190
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9.55
Q1: 44.5
Median: 89.5
Q3: 137.25
95-th percentile: 179.45
Maximum: 190
Range: 189
Interquartile range (IQR): 92.75

Descriptive statistics

Standard deviation: 54.39648825
Coefficient of variation (CV): 0.5962018721
Kurtosis: -1.173752196
Mean: 91.23837209
Mean absolute deviation (MAD): 46.96484586
Skewness: 0.08027328543
Sum: 15693
Variance: 2958.977934
Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
171 | 2 | 1.0%
182 | 1 | 0.5%
104 | 1 | 0.5%
69 | 1 | 0.5%
85 | 1 | 0.5%
56 | 1 | 0.5%
6 | 1 | 0.5%
37 | 1 | 0.5%
98 | 1 | 0.5%
48 | 1 | 0.5%
Other values (161) | 161 | 77.8%
(Missing) | 35 | 16.9%

Minimum 5 values

Value | Count | Frequency (%)
1 | 1 | 0.5%
2 | 1 | 0.5%
3 | 1 | 0.5%
4 | 1 | 0.5%
5 | 1 | 0.5%

Maximum 5 values

Value | Count | Frequency (%)
190 | 1 | 0.5%
189 | 1 | 0.5%
188 | 1 | 0.5%
186 | 1 | 0.5%
184 | 1 | 0.5%

Unnamed: 10
Real number (ℝ≥0)

Distinct count: 206
Unique (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.09664171935
Minimum: 0.05370496261
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 0.05370496261
5-th percentile: 0.06040500315
Q1: 0.06763877281
Median: 0.07836891345
Q3: 0.1007718614
95-th percentile: 0.16473467
Maximum: 1
Range: 0.9462950374
Interquartile range (IQR): 0.03313308862

Descriptive statistics

Standard deviation: 0.07830990974
Coefficient of variation (CV): 0.8103116363
Kurtosis: 89.58902127
Mean: 0.09664171935
Mean absolute deviation (MAD): 0.03361090223
Skewness: 8.426921421
Sum: 20.00483591
Variance: 0.006132441964
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.05370496 0.05933764 0.09053005 0.11776359 0.21538462 1. ], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
0.2 | 2 | 1.0%
0.07149122807 | 1 | 0.5%
0.08649215346 | 1 | 0.5%
0.09581257271 | 1 | 0.5%
0.1032842328 | 1 | 0.5%
0.08803310623 | 1 | 0.5%
0.07651976761 | 1 | 0.5%
0.156628392 | 1 | 0.5%
0.1031390135 | 1 | 0.5%
0.1168831169 | 1 | 0.5%
Other values (196) | 196 | 94.7%

Minimum 5 values

Value | Count | Frequency (%)
0.05370496261 | 1 | 0.5%
0.05675287356 | 1 | 0.5%
0.0579004329 | 1 | 0.5%
0.05801263642 | 1 | 0.5%
0.05926934205 | 1 | 0.5%

Maximum 5 values

Value | Count | Frequency (%)
1 | 1 | 0.5%
0.5 | 1 | 0.5%
0.3333333333 | 1 | 0.5%
0.25 | 1 | 0.5%
0.2307692308 | 1 | 0.5%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
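
The calculation described above can be sketched in plain Python. This is a minimal illustration on toy data, not the code the report itself runs; the function name `pearson_r` and the sample values are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when either sample is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Toy data (not taken from the report).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson_r(x, y)
# Shifting and rescaling x leaves r unchanged (location/scale invariance).
r_shifted = pearson_r([10 * v + 7 for v in x], y)
print(round(r, 4), round(r_shifted, 4))
```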

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
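
As a minimal sketch of that recipe, the example below ranks each sample (ties receive the average of their ranks, a common convention assumed here) and then applies the covariance-over-standard-deviations formula to the ranks. The helper names and the toy data are illustrative assumptions, not part of the report.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def _pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    ss_x = sum((a - mx) ** 2 for a in xs)
    ss_y = sum((b - my) ** 2 for b in ys)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables of xs and ys."""
    return _pearson(average_ranks(xs), average_ranks(ys))


# Toy data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = _pearson(x, y)
print(rho, round(r, 4))
```

On this toy sample the ranks of x and y coincide, so ρ reaches 1 while r on the raw values stays below 1.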

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
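
The pair-counting definition above can be sketched directly. This illustrates the plain form stated in the text (often called τ-a); library implementations commonly apply a tie-adjusted variant, so results may differ when the data contain ties. The function name and toy data are assumptions made for the example.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, as described above."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data: one swapped pair out of six gives 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(round(tau, 4))
```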

Missing values

Sample

First rows

Index | Exporters | Value exported in 2019 (USD thousand) | Trade balance in 2019 (USD thousand) | Annual growth in value between 2015-2019 (%) | Annual growth in value between 2018-2019 (%) | Share in world exports (%) | CO2 emission (tons) | Average distance of importing countries (km) | Concentration of importing countries | Ease of doing business ranking | Unnamed: 10
0 | Germany | 89227214 | 32519888 | 6.0 | -7.0 | 14.1 | 283222.0 | 3572.0 | 0.07 | 24.0 | 0.140567
1 | Switzerland | 83047883 | 51485644 | 8.0 | 10.0 | 13.1 | NaN | 4700.0 | 0.11 | 38.0 | 0.152230
2 | Belgium | 53608528 | 7212172 | 6.0 | 13.0 | 8.4 | 127130.0 | 3834.0 | 0.09 | 45.0 | 0.115912
3 | United States of America | 53561670 | -74676311 | 3.0 | 11.0 | 8.4 | 529696.0 | 7744.0 | 0.06 | 8.0 | 0.130995
4 | Ireland | 53555313 | 45340758 | 17.0 | 1.0 | 8.4 | 40714.0 | 4083.0 | 0.23 | 23.0 | 0.150723
5 | Netherlands | 48351244 | 15700287 | 16.0 | 13.0 | 7.6 | 57987.0 | 2904.0 | 0.08 | 36.0 | 0.160227
6 | France | 35554964 | 10456692 | 4.0 | 5.0 | 5.6 | 158980.0 | 3609.0 | 0.06 | 32.0 | 0.140302
7 | Italy | 34123303 | 6476035 | 15.0 | 23.0 | 5.4 | 72304.0 | 3310.0 | 0.09 | 51.0 | 0.156628
8 | United Kingdom | 26928329 | -1015669 | -6.0 | -10.0 | 4.2 | 102741.0 | 4557.0 | 0.09 | 9.0 | 0.146558
9 | Denmark | 17545114 | 13091551 | 10.0 | 22.0 | 2.8 | 42041.0 | 3838.0 | 0.73 | 3.0 | 0.111888

Last rows

Index | Exporters | Value exported in 2019 (USD thousand) | Trade balance in 2019 (USD thousand) | Annual growth in value between 2015-2019 (%) | Annual growth in value between 2018-2019 (%) | Share in world exports (%) | CO2 emission (tons) | Average distance of importing countries (km) | Concentration of importing countries | Ease of doing business ranking | Unnamed: 10
197 | Cabo Verde | 3 | -13947 | -50.0 | NaN | 0.0 | NaN | 6442.0 | 0.56 | 131.0 | 0.230769
198 | Tuvalu | 2 | -124 | -33.0 | NaN | 0.0 | NaN | 15071.0 | 1.00 | NaN | 0.200000
199 | Ship stores and bunkers | 1 | -1111 | -88.0 | NaN | 0.0 | NaN | NaN | 1.00 | NaN | 0.125000
200 | Sao Tome and Principe | 1 | -2297 | -61.0 | NaN | 0.0 | NaN | 3924.0 | 1.00 | 170.0 | 0.142857
201 | Suriname | 1 | -18394 | -31.0 | NaN | 0.0 | NaN | 7505.0 | 1.00 | 165.0 | 0.166667
202 | Cook Islands | 1 | -452 | NaN | NaN | 0.0 | NaN | 3206.0 | 1.00 | NaN | 0.200000
203 | Mozambique | 1 | -210966 | 0.0 | -100.0 | 0.0 | NaN | NaN | NaN | 135.0 | 0.250000
204 | Mauritania | 1 | -49982 | NaN | NaN | 0.0 | NaN | NaN | NaN | 148.0 | 0.333333
205 | Sint Maarten (Dutch part) | 1 | -5948 | -45.0 | NaN | 0.0 | NaN | NaN | 1.00 | NaN | 0.500000
206 | Haiti | 1 | -39425 | NaN | NaN | 0.0 | NaN | NaN | NaN | 182.0 | 1.000000